Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents
Similar resources
Tensor-Based Semantically-Aware Topic Clustering of Biomedical Documents
Biomedicine is a pillar of the collective, scientific effort of human self-discovery, as well as a major source of humanistic data codified primarily in biomedical documents. Despite their rigid structure, maintaining and updating a considerably-sized collection of such documents is a task of overwhelming complexity mandating efficient information retrieval for the purpose of the integration of...
A Dynamic and Semantically-Aware Technique for Document Clustering in Biomedical Literature
As an unsupervised learning process, document clustering has been used to improve information retrieval performance by grouping similar documents and to help text mining approaches by providing a high-quality input for them. In this paper, the authors propose a novel hybrid clustering technique that incorporates semantic smoothing of document models into a neural network framework. Recently, it...
Semantically-Guided Clustering of Text Documents via Frequent Subgraphs Discovery
In this paper we introduce and analyze two improvements to GDClust [1], a system for document clustering based on the co-occurrence of frequent subgraphs. GDClust (Graph-Based Document Clustering) works with frequent senses derived from the constraints provided by the natural language rather than working with the co-occurrences of frequent keywords commonly used in the vector space model of doc...
Clustering View-Segmented Documents via Tensor Modeling
We propose a clustering framework for view-segmented documents, i.e., relatively long documents made up of smaller fragments that can be provided according to a target set of views or aspects. The framework is designed to exploit a view-based document segmentation into a third-order tensor model, whose decomposition result would enable any standard document clustering algorithm to better reflec...
Extracting Topic Words and Clustering Documents by Probabilistic Graphical Models
We present a method for clustering documents and extracting topic words of each cluster using a probabilistic graphical model. We maximize the likelihood of the model with the Expectation Maximization algorithm. Our experiments demonstrate that the latent variables of the model can be seen as clusters of documents and terms.
Journal
Journal title: Computation
Year: 2017
ISSN: 2079-3197
DOI: 10.3390/computation5030034